*! version 5.0
* 13 August 2018
* NIDS

* THIS IS A FOOD AND NON-FOOD EXPENDITURE DO FILE: 4 OF 14

*=====================================================================================================================================
* GLOBALS FOR DATA FILES, DO FILES AND VERSION SUFFIXES

* DEFINED IN "W1 Food_NonFood Expenditure - Master  Food_NonFood Expenditure do file  (1 of 14).do"

*=====================================================================================================================================
* SETTING UP STATA TO RUN DO FILES

clear
cap clear matrix
set more off 

version 12.0

*-------------------------------------------------------------------------------------------------------------------------------------
**********************************************************************
***			Food 1: Preparing for Imputation
**********************************************************************
***				CLEANING
**********************************************************************
use "$DataOUT\tempdata2.dta", clear

****Manual corrections to individual records, made for the purposes of the expenditure calculations
replace e1_1=190 if a3==101793
replace e1_1=400 if a3==103454
replace e1_1=800 if a3==107390
replace e1_6_7=600/12 if a3==103769
replace e1_3_29=530 if a3==107836

**Further corrections, to more doubtful cases
replace e1_1=200 if a3==107587
replace e1_1=. if a3==103416
replace e1_1=700 if a3==112115
replace e1_1=. if e1_1==0				//applies to every record: a reported zero is treated as missing
replace e1_1=1000 if a3==103001
replace e1_1=180 if a3==113872
replace e1_5_1=. if a3==103251
replace e1_1=2000 if a3==111387
replace w1_h_expnd=180 if a3==113872

**********************************************************************
***				FOOD ITEM AGGREGATION
**********************************************************************

*Generates a raw aggregate of food consumption: Sums all possible reported food consumption before imputation
egen totalfood =rowtotal(e1_4_* e1_5_* e1_6_* e1_3_*)

*Generates a total consumption value for each food item. From here on we aggregate food consumption at the item level (we do not distinguish between 
*different types of food consumption when calculating our consumption aggregates). 
forvalues a=1/32{
	egen total`a' =rowtotal(e1_3_`a' e1_4_`a' e1_5_`a' e1_6_`a')
}

*Replaces all zeros with missing values. In this data set there are no legitimate zeros (in the sense that a household has reported zero consumption
*from every source of consumption for an item which they claimed to consume). Rather, this adjusts for the fact that the Stata egen function 'rowtotal' (which
*is used above) treats missing values as zeros, so four missing values sum to zero when the total should be missing
forvalues a=1/32{
	replace total`a'=. if total`a'==0
}
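
*A minimal sketch of how the 'no legitimate zeros' claim above could be checked (an addition, not part of the original procedure). It assumes the
*'missing' option of egen's rowtotal(), which returns missing only when every source variable is missing, so any zero it returns comes from
*reported values rather than from missing ones. chk_total is a hypothetical temporary name and is dropped again, so the saved data are unchanged.
forvalues a=1/32{
	quietly egen chk_total =rowtotal(e1_3_`a' e1_4_`a' e1_5_`a' e1_6_`a'), missing
	quietly count if chk_total==0
	if r(N)>0 display "Item `a': " r(N) " reported zero totals were set to missing"
	drop chk_total
}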

**********************************************************************
***				PSU AND DISTRICT LEVEL DATA
**********************************************************************

*These data will most commonly be used for cell median imputation at the PSU and district level. This method checks that response rates
*are high enough for cell median imputation to be valid. These response rates are calculated below

*Generates PSU totals, observation counts, means and medians of consumption for each food type
forvalues a=1/32{
	quietly egen psutot`a'=total(total`a'), by(w1_cluster)
	quietly egen psucount`a'=count(total`a'), by(w1_cluster)
	quietly gen psumean`a'=psutot`a'/psucount`a'
	quietly egen psumedian`a'=median(total`a'), by(w1_cluster)
}

**********************************************************************
*Generates the district observation count and median of consumption for each food type
forvalues a=1/32{
	egen dismedian`a' = median(total`a'), by(w1_dc2011)
	egen disfcount`a'=count(total`a'), by(w1_dc2011)
}

**********************************************************************
*Generates the rate of response for each food item, within PSU and then within district. This is done by counting the number of people claiming to
*consume that food, and dividing the number of consumption observations by this 'size' value

forvalues a=1/32{

	quietly gen f`a'counter = e1_2_`a' if e1_2_`a'==1				//indicator: 1 if the item is consumed, missing otherwise

	egen psufsize`a' =count(f`a'counter), by(w1_cluster)			//by psu
	gen psufrate`a'=psucount`a'/psufsize`a'

	egen disfsize`a' = count(f`a'counter), by(w1_dc2011)				//by district
	gen disfrate`a'=disfcount`a'/disfsize`a'
}
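
*A minimal sketch of a range check on the response rates above (an addition, not part of the original procedure). A rate above 1 would mean
*that more consumption values were observed in the cell than there were observations claiming to consume the item. The check only displays
*a message and changes no data.
forvalues a=1/32{
	quietly count if psufrate`a'>1 & psufrate`a'<.
	if r(N)>0 display "Item `a': " r(N) " observations in a PSU with a response rate above 1"
	quietly count if disfrate`a'>1 & disfrate`a'<.
	if r(N)>0 display "Item `a': " r(N) " observations in a district with a response rate above 1"
}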

*-------------------------------------------------------------------------------------------------
save "$DataOUT\tempdata3.dta", replace


